Symbolic Machine Learning: a Different Answer to the Problem of the Acquisition of Lexical Knowledge from Corpora
Abstract
One useful way to structure the field of lexical knowledge acquisition from corpora (complex terms, or relations between lexical units) is to contrast numerical and symbolic techniques. Numerical approaches exploit the frequency-based aspect of the data and use statistical techniques, while symbolic approaches exploit the structural aspect of the data and use structural or symbolic information. Methods from the former approach have been widely used and yield portable, robust, and fully automatic systems. However, they provide poor explanations of their results and may have difficulty capturing very specific relations. The symbolic approach comprises two strategies. The first is the symbolic linguistic approach, in which linguists manually establish operational definitions of the elements to acquire, usually in the form of morpholexical patterns that carry the studied terms or relations, or as a list of linguistic clues. When such patterns or clues are unknown, but examples of elements that satisfy the target terms or relations are known, techniques from the second strategy can be used, namely symbolic machine learning (ML) methods. This strategy, far less known and less widely used, is only beginning to appear and spread in the natural language processing community. The aim of this paper is to highlight the value of such techniques and to show how they can be used to infer efficient and expressive extraction patterns for complex terms or lexical relations from examples of elements that verify the target relations or the form of the terms. However, these techniques are often supervised, i.e. they require (manually) supplied examples. We also explain that a numerical method and a symbolic ML method can be combined in order to keep the advantages of both: meaningful patterns, efficient extraction, and portability.
Publication date: 2005